Skip to content

Mixed deltas plot (rewritten)#511

Merged
fweber144 merged 21 commits into
stan-dev:masterfrom
fweber144:mixed_deltas_plot_rewritten
Jun 6, 2025
Merged

Mixed deltas plot (rewritten)#511
fweber144 merged 21 commits into
stan-dev:masterfrom
fweber144:mixed_deltas_plot_rewritten

Conversation

@fweber144
Copy link
Copy Markdown
Collaborator

@fweber144 fweber144 commented Apr 23, 2025

This is a rewritten version of the mixed deltas plot (#508, originally proposed in #496) that comes with less changes compared to master and should be less error-prone.

@avehtari, could you check if the plots created by this version are ok for you? Especially the axis labels and the plot subtitle would need to be checked (there are a few changes in the labels/subtitle compared to #508).

Here, the new column behavior of the performances() table (which is essentially the summary.vsel() table) implemented in #508 is not included. I would suggest to create a separate PR for that new column behavior if it is still desired (personally, I would prefer not to change it because it has been this way for a long time, already before I started working on projpred).

This PR is not completely ready to be merged. There are a few things I still need to do:

  • Search for deltas in the whole codebase, especially to ensure that the docs and the vignettes are updated appropriately.
  • Add new unit tests.
  • Add a NEWS.md entry.

I would tackle these if @avehtari confirms that the general implementation is ok. If we go with this version, #508 would have to be closed (not merged).

@fweber144 fweber144 requested a review from avehtari April 23, 2025 10:51
@avehtari
Copy link
Copy Markdown
Member

With this PR branch, I get an error when using deltas="mixed" with more than stats

+ Error in plot.vsel(cvvs_fast, stats = c("elpd", "R2"), deltas = "mixed",  (from methods.R#771) : 
  Unexpected length of `stats_bs_init$value`. Please report this.

(e.g. using the example in the vignette)

@avehtari
Copy link
Copy Markdown
Member

The github PR review system is broken and if I try to make comments specific to certain lines my browser freezes (tested both Chrome and Firefox):

methods.R lines 1110-1125:
This produces a text that this too long if deltas="mixed". If the additional text is shown, there should be newline after the comma, to make the text about the same length as texts below the plot

@avehtari
Copy link
Copy Markdown
Member

Here, the new column behavior of the performances() table (which is essentially the summary.vsel() table) implemented in #508 is not included. I would suggest to create a separate PR for that new column behavior if it is still desired (personally, I would prefer not to change it because it has been this way for a long time, already before I started working on projpred).

Without that change to the table, how do you get the values for the intervals corresponding to the plot?

@fweber144
Copy link
Copy Markdown
Collaborator Author

methods.R lines 1110-1125: This produces a text that this too long if deltas="mixed". If the additional text is shown, there should be newline after the comma, to make the text about the same length as texts below the plot

Thanks; fixed in 5638c09.

@fweber144
Copy link
Copy Markdown
Collaborator Author

With this PR branch, I get an error when using deltas="mixed" with more than stats

+ Error in plot.vsel(cvvs_fast, stats = c("elpd", "R2"), deltas = "mixed",  (from methods.R#771) : 
  Unexpected length of `stats_bs_init$value`. Please report this.

(e.g. using the example in the vignette)

Thanks; fixed in ee57c83.

@fweber144
Copy link
Copy Markdown
Collaborator Author

Here, the new column behavior of the performances() table (which is essentially the summary.vsel() table) implemented in #508 is not included. I would suggest to create a separate PR for that new column behavior if it is still desired (personally, I would prefer not to change it because it has been this way for a long time, already before I started working on projpred).

Without that change to the table, how do you get the values for the intervals corresponding to the plot?

From the diff columns (as in #508, I have added diff columns diff.lq and diff.uq to .tabulate_stats() output; I have also ensured that argument baseline is respected in that case).

(a merge operation was necessary instead of simply omitting the error; see
discussion at stan-dev#511)
@fweber144
Copy link
Copy Markdown
Collaborator Author

With this PR branch, I get an error when using deltas="mixed" with more than stats

+ Error in plot.vsel(cvvs_fast, stats = c("elpd", "R2"), deltas = "mixed",  (from methods.R#771) : 
  Unexpected length of `stats_bs_init$value`. Please report this.

(e.g. using the example in the vignette)

Thanks; fixed in ee57c83.

That was a bit too hasty. I didn't realize that a merge operation was necessary (this was the reason why I added the error back then). Fixed now in b86caaa.

@avehtari
Copy link
Copy Markdown
Member

avehtari commented May 21, 2025

deltas="mixed" does not work correctly for stats="gmpd" (EDIT: this used to work in my PR)
image

@avehtari
Copy link
Copy Markdown
Member

With student case study https://users.aalto.fi/~ave/casestudies/VariableSelection/student.html, this worked in my PR

plot(vselm, stats=c("R2"), deltas=FALSE)

but with this PR gives an error

Error in sqrt_cut0((value_se^2 - 2 * mse_e/mse_y * cov_mse_e_y + (mse_e/mse_y)^2 *  (from misc.R#767) : 
  Negative (and numerically non-zero) value used as input to sqrt_cut0().

@fweber144
Copy link
Copy Markdown
Collaborator Author

deltas="mixed" does not work correctly for stats="gmpd" (EDIT: this used to work in my PR)

Thanks; fixed in 69b349b.

@fweber144
Copy link
Copy Markdown
Collaborator Author

With student case study https://users.aalto.fi/~ave/casestudies/VariableSelection/student.html, this worked in my PR

plot(vselm, stats=c("R2"), deltas=FALSE)

but with this PR gives an error

Error in sqrt_cut0((value_se^2 - 2 * mse_e/mse_y * cov_mse_e_y + (mse_e/mse_y)^2 *  (from misc.R#767) : 
  Negative (and numerically non-zero) value used as input to sqrt_cut0().

I get the same error with your PR. I guess you ran the code from your PR when commit f132e70 was not yet included in upstream/master (and in your PR branch neither, because I tried to keep your PR branch up-to-date with upstream/master). Note that the corresponding lines

projpred/R/summary_funs.R

Lines 441 to 443 in 0362770

value_se <- sqrt_cut0((value_se^2 -
2 * mse_e / mse_y * cov_mse_e_y +
(mse_e / mse_y)^2 * var_mse_y) / mse_y^2)
have also been modified by #514, but that didn't change the behavior of those lines (sqrt_cut0() was added so that cutting off negative variances can be done at other places in the same manner).

Now the question is how we should fix that (on upstream/master). We could simply increase the tolerance used inside of sqrt_cut0(), which is currently at sqrt(.Machine$double.eps) (which is 1.5e-08 on my machine). In your case, we would have to increase it at least to 8.6e-06 (the value of abs(x) in your case), so 1e3 * sqrt(.Machine$double.eps) might make sense. Is that ok for you?

@avehtari
Copy link
Copy Markdown
Member

1e3 * sqrt(.Machine$double.eps) might make sense. Is that ok for you?

ok

@fweber144
Copy link
Copy Markdown
Collaborator Author

1e3 * sqrt(.Machine$double.eps) might make sense. Is that ok for you?

ok

Fixed in 4daed93 (and merged upstream/master into this PR branch), so the issue should not occur anymore (I checked this on my machine).

@avehtari
Copy link
Copy Markdown
Member

avehtari commented Jun 3, 2025

I just tested plotting with all different stats using Portuguese student data (for continuos data) and Diabetes for binary data, and all did work well

@fweber144
Copy link
Copy Markdown
Collaborator Author

I just tested plotting with all different stats using Portuguese student data (for continuos data) and Diabetes for binary data, and all did work well

Thanks. I finished the open to-do items from above and will merge now.

@fweber144 fweber144 merged commit 8f63a6b into stan-dev:master Jun 6, 2025
@fweber144 fweber144 deleted the mixed_deltas_plot_rewritten branch June 6, 2025 13:46
fweber144 added a commit that referenced this pull request Jun 7, 2025
fweber144 added a commit that referenced this pull request Jun 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants